International Journal of Electrical and Computer Engineering (IJECE) 
Vol. 12, No. 5, October 2022, pp. 4978~4987 
ISSN: 2088-8708, DOI: 10.11591/ijece.v12i5.pp4978-4987


A new procedure for lung region segmentation from computed 
tomography images 


Mohd Firdaus Abdullah¹, Siti Noraini Sulaiman¹, Muhammad Khusairi Osman¹, Noor Khairiah Abdul Karim², Samsul Setumin¹, Iza Sazanita Isa¹
¹School of Electrical Engineering, Universiti Teknologi MARA, Cawangan Pulau Pinang, Malaysia
²Advanced Medical and Dental Institute, Universiti Sains Malaysia, Pulau Pinang, Malaysia


Article Info

Article history:
Received Jul 14, 2021
Revised Mar 17, 2022
Accepted Apr 14, 2022

Keywords:
CT scan image
Image masking
Image thresholding
Lung cancer
Lung segmentation

ABSTRACT

Lung cancer is the leading cause of cancer death worldwide. The primary aim of this research is to establish an image processing method for lung cancer detection. This paper focuses on lung region segmentation from computed tomography (CT) scan images. In this work, a new procedure for lung region segmentation is proposed. First, the lung CT scan images undergo an image thresholding stage before going through two morphological reconstruction and masking stages. Between the morphological reconstruction and masking stages, object extraction, border change, and object elimination take place. Finally, the lung field is annotated. The outcomes of the proposed procedure and of a previous lung segmentation method, the modified watershed segmentation method, are compared with the ground truth image in qualitative and quantitative manners. Based on the analyses, the proposed procedure shows better performance, with an increase of 0.02% to 3.5% in the quantitative analysis. The proposed procedure was also the most frequently selected method by the 22 experts. This study shows that the outcome of the proposed method outperforms the existing modified watershed segmentation method.

1. INTRODUCTION

Lung cancer is the most common cancer in the world. It is the second most common disease in men and women [1]. Smoking addiction, radioactive gas, and air pollution are the leading causes of lung cancer [2], [3]. Based on previous studies, computed tomography (CT) images are used to detect lung cancer [4]-[6], and CT has become a necessity in medical imaging worldwide. Before any treatment is given to a lung cancer patient, the doctor determines the tumor's location, shape, and size by using imaging modalities. These modalities include CT [5]-[9], magnetic resonance imaging (MRI) [10], and X-ray [6], [11]. However, a CT scan is preferred because it is easy to use and provides accurate classification and localization of foreign masses [3]. In addition, MRI is more expensive, less available, and more time-consuming than CT [12]. CT images are recorded and then used by radiologists to perform diagnoses. A CT scan can show several types of tissue, such as lung, bone, soft tissue, and blood vessels, with a clarity that cannot be achieved in conventional radiographs [13]. These scans yield a large amount of image data. The growing volume of data from CT exams makes manual analysis of nodules by radiologists impractical [14]. As a result, computer-aided detection (CAD) is used to help radiologists with nodule analysis.

Image segmentation is one of the most critical areas in the medical field, because its outcome affects several important decisions made by medical experts. Hence, image segmentation is an essential step in the development of a CAD system [15]-[17] and has become one of the popular methods applied to medical images to enhance the identification of abnormalities such as pleural effusions, consolidations, and masses in a particular region [18]. Image enhancement and segmentation techniques used for the detection of cancer in the lungs include image scaling, color space transformation, contrast enhancement [19], thresholding [18], [20], and watershed-controlled segmentation [21], [22]. A thresholding technique has been used to remove thoracic structures from the nodule candidate region, and it offers low storage requirements, fast processing, and ease of manipulation [5]. The technique produces a binary image in which pixels with a grey level higher than the threshold value are set to one and all other pixels are set to zero. The aim is to simplify or change the representation of an image into something easier to analyze based on color, texture, and shape. Differences in these features then make it possible to isolate regions.
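
The thresholding rule described above can be sketched in a few lines. This is a minimal illustration in plain Python, not the MATLAB implementation used in this work. The function names `otsu_threshold` and `binarize` are hypothetical, the image is assumed to be a nested list of intensities normalized to [0, 1], and the 256-bin histogram is an assumption. The threshold is chosen with Otsu's method, which the procedure in section 2.2 also relies on.

```python
def otsu_threshold(pixels, bins=256):
    """Return an Otsu threshold for grey levels normalized to [0, 1].

    The histogram resolution (bins) is an assumption of this sketch.
    """
    hist = [0] * bins
    for p in pixels:
        hist[min(int(p * bins), bins - 1)] += 1

    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_bin, best_score = 0, -1.0
    count_low = 0  # number of pixels in bins 0..t
    sum_low = 0    # sum of bin indices for those pixels
    for t in range(bins - 1):
        count_low += hist[t]
        sum_low += t * hist[t]
        count_high = total - count_low
        if count_low == 0 or count_high == 0:
            continue
        mean_low = sum_low / count_low
        mean_high = (sum_all - sum_low) / count_high
        # Proportional to the between-class variance that Otsu maximizes.
        score = count_low * count_high * (mean_low - mean_high) ** 2
        if score > best_score:
            best_score, best_bin = score, t
    # Use the upper edge of the best bin as the threshold.
    return (best_bin + 1) / bins


def binarize(image, thresh):
    """Set pixels above thresh to 1 and all other pixels to 0."""
    return [[1 if p > thresh else 0 for p in row] for row in image]
```

A typical use would flatten the image (or a central window of it) into a list of pixels, pass that list to `otsu_threshold`, and then call `binarize` on the full image with the returned value.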

The existing diagnosis method requires a human expert or radiologist to identify the lung region before detecting lesions. Many systems have been developed to detect lung cancer, and research is ongoing. However, some systems do not achieve satisfactory detection accuracy, so there is room for improvement in accuracy, sensitivity, and specificity. The objective of this study is to obtain the lung region from CT scan images, as a step toward detecting lung cancer lesions.

This study presents a new procedure for efficient and accurate lung area segmentation that addresses the gaps left by previous research. The procedure is built from image processing algorithms and was implemented in MATLAB. The paper is organized as follows: section 1 is the introduction, section 2 describes the research method, section 3 presents the results and discussion, covering data collection and image segmentation, and section 4 concludes the study and suggests future research.


2. RESEARCH METHOD 

This section explains the method used to detect the lung region using a new segmentation procedure designed specifically for the lung region. Figure 1 depicts a flowchart summarizing the proposed method. As shown in Figure 1, the proposed method is divided into three main stages: data collection, lung segmentation, and performance evaluation.

Figure 1. Flowchart of the proposed method

In the first stage, data was collected from the imaging department, Advanced Medical and Dental Institute (AMDI), Universiti Sains Malaysia (USM), Kepala Batas, Pulau Pinang. The second stage, lung segmentation, applies the new proposed procedure. This work serves as a pilot study for improving existing techniques for segmenting the lung in CT scan images. The results of the proposed procedure are compared with those of the modified watershed segmentation performed by [21], because that study is the most recent, has shown promising results, and is closely related to this study. Finally, qualitative and quantitative analyses compare the proposed method with the existing lung segmentation method. MATLAB 2019 was used in this study because it provides many suitable functions for the project. The stages in the flowchart are explained in detail in the following subsections.


2.1. Datasets 

This section explains the dataset used in this research. Data was collected from subjects with underlying lung cancer: patients with underlying lung malignancy who were scanned at the imaging unit, AMDI, USM were included as samples. Images from the picture archiving and communication system (PACS) at AMDI that show evidence of nodules were collected, and images that show evidence of tumor but had an entirely normal CT scan study (reported by a radiologist as having no nodule or any other imaging evidence of lung cancer) were also included as control group subjects. Figure 2 shows CT lung cancer images from two patients at soft-tissue density and at different angles. A total of 1,155 soft-tissue density images from 5 subjects were used in this work. To comply with ethical regulations, the data was de-identified and anonymized before being used in this study.


Figure 2. Different angles of CT lung cancer images 


2.2. The proposed procedure for lung segmentation 

This section reports the main contribution of the study. In line with the objective, this study focuses on lung segmentation from CT scan images using a new image processing procedure, which is proposed with reference to a previous method. This work serves as a pilot study for improving those techniques and algorithms into a suitable image processing method for quantifying lung CT scan images, specifically for lung segmentation. As mentioned, the results are compared with modified watershed segmentation because the study by [21] is the most recent, has shown promising results, and is closely related to this study. The outcomes of both methods are compared with ground truth images for performance evaluation. The ground truth images were created using Adobe Photoshop: the lung area was cropped manually without changing the file format and then validated by the radiologist. Removing the background around the lung isolates the specific area manually so that the distinguishing features of the lung image can be observed.

In a previous study [18], a thresholding-based method was used for lung segmentation, with the modified watershed method integrated into the morphological operation. In comparison, the proposed method introduces additional procedures to further improve the extraction of the lung region. The overall proposed procedure for lung segmentation is shown in Figure 3. The details of each process are given below as a pseudo-code representation of the proposed work.


The proposed procedure for lung segmentation 

Input: Let $I(x,y)$ be a color image of size $W \times L$ that is to be segmented by the proposed method.

Step 1: Convert the image $I(x,y)$ into a normalized grayscale image $I_G(x,y)$ such that its intensity takes values between 0 and 1, i.e., $I_G(x,y) \in [0,1]$.

Step 2: Determine the threshold value, thresh. The threshold value is used to separate the lung in the CT image from the background by classifying each pixel, according to its intensity, as either an object point or a background point. The threshold value is determined by applying Otsu's global thresholding method to a center-windowed image $I_{CW}$ of size $W_W \times W_L$, where $W_W$ and $W_L$ are set as fractions of $W$ and $L$, respectively. This value is used in step 3 to distinguish between the background and a lung object.


Step 3: Convert the image $I_G(x,y)$ into a black and white image $I_{BW}(x,y)$. The black and white image is obtained by thresholding the intensities of $I_G(x,y)$ with the thresh value from step 2:

\[ I_{BW}(x,y) = \begin{cases} 1 & \text{if } I_G(x,y) > thresh \\ 0 & \text{if } I_G(x,y) \le thresh \end{cases} \tag{1} \]

where thresh is the threshold value determined using Otsu's thresholding method in step 2. The result of thresholding is a binary image, where pixels with an intensity value of 1 correspond to objects, whereas pixels with value 0 correspond to the background.

Step 4: Fill the holes in the black and white image using the morphological reconstruction algorithm in [23]. The following procedure fills all the holes with 1s:

\[ I_{BWF}(x,y)_k = \left( I_{BW}(x,y)_{k-1} \oplus B \right) \cap A^c \tag{2} \]


Step 5: Extract the largest binary large object (BLOB). The steps to obtain the BLOB are: i) assign labels to all connected components in $I_{BWF}(x,y)$ to obtain $I_{BB}(x,y)$. Connected-component labeling is applied to the binary image $I_{BWF}(x,y)$, which contains a number of objects; pixels that belong to an object are denoted 1 (true), while background pixels are 0 (false). A variety of image quantities and features of the black and white image, such as area and centroid, are then measured; these measurements can be carried out on both contiguous and discontiguous regions; ii) compute the areas of all blobs; and iii) sort the computed areas in descending order and select only the first blob (the one with the largest area).

Step 6: Apply the first image masking, $I_{M1}(x,y)$, to the image that contains the blob with the largest area. The procedure has two steps. First, convert the $I_{BWF}(x,y)$ image to a double-precision image, $I_{DP}(x,y)$:

\[ I_{DP}(x,y) = \frac{\mathrm{double}\left(I_{BWF}(x,y)\right)}{\mathrm{intmax}\left(\mathrm{class}\left(I_{BWF}(x,y)\right)\right)} \tag{3} \]

Then, apply image masking as in (4):

\[ I_{M1}(x,y) = I_{DP}(x,y) \circ I_{BB}(x,y) \tag{4} \]


Step 7: Change the blob's border from black to white, as follows: i) erode $I_{BB}(x,y)$ using a disk structuring element of size $a$ (empirically, $a = 10$); and ii) invert the eroded image to obtain $I'_{BB}(x,y)$ and use (5) to change the blob's border from black to white, giving $I'(x,y)$:

\[ I'(x,y) = I_{M1}(x,y) + I'_{BB}(x,y) \tag{5} \]


Step 8: Set the image inversion, $I_{INV}(x,y)$:

\[ I_{INV}(x,y) = \mathrm{invert}\left(I'(x,y)\right) \tag{6} \]


Step 9: Eliminate blobs with an area of less than $\beta$ pixels, as follows: apply image inversion to $I_{INV}(x,y)$, giving $I'_{INV}(x,y)$; then determine the area of each blob using 8-connected neighbors and eliminate all blobs with a size of less than $\beta$ pixels. Empirically, $\beta = 1500$ pixels.

Step 10: Fill the black pixels in the $I'_{INV}(x,y)$ image using the morphological reconstruction algorithm in [23] to obtain $I_{clean}(x,y)$. The following procedure fills all the holes with 1s:

\[ I'_{INV}(x,y)_k = \left( I_{clean}(x,y)_{k-1} \oplus B \right) \cap A^c \tag{7} \]

where $I_{clean}(x,y)$ is the black and white image, $I'_{INV}(x,y)$ is the black and white image with filled holes, $B$ is the symmetric structuring element, and $A$ is a set containing one or more connected components. The algorithm terminates at iteration step $k$ if $I'_{INV}(x,y)_k = I_{clean}(x,y)_{k-1}$. The set $I'_{INV}(x,y)_k$ then contains all the filled holes. The set union of $I'_{INV}(x,y)_k$ and $A$ contains all the filled holes and their boundaries. The intersection at each step with $A^c$ limits the result to inside the region of interest.

Step 11: Apply the second image masking, $I_{M2}(x,y)$, to the image that contains blobs with an area of more than 1500 pixels. First, convert the $I_{clean}(x,y)$ image to a double-precision image, $I'_{clean}(x,y)$:

\[ I'_{clean}(x,y) = \frac{\mathrm{double}\left(I_{clean}(x,y)\right)}{\mathrm{intmax}\left(\mathrm{class}\left(I_{clean}(x,y)\right)\right)} \tag{8} \]

Then apply image masking as in (9):

\[ I_{M2}(x,y) = I'_{clean}(x,y) \circ I(x,y) \tag{9} \]

Output: $I_{M2}(x,y)$.

Figure 3. The proposed procedure for lung segmentation
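
Some of the building blocks of the procedure above can be sketched in plain Python: hole filling (steps 4 and 10), largest-blob extraction (step 5), small-blob elimination (step 9), and masking (steps 6 and 11). This is an illustrative sketch under stated assumptions, not the MATLAB implementation used in this work. All function names are hypothetical, images are assumed to be nested lists of 0/1 values, and hole filling is done here with a flood fill from the image border instead of the iterative dilation of (2) and (7). The border-change step (step 7) is not covered.

```python
from collections import deque

NEIGHBORS_8 = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
               (0, 1), (1, -1), (1, 0), (1, 1)]
NEIGHBORS_4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def fill_holes(binary):
    """Set to 1 every background pixel that is not reachable from the border."""
    rows, cols = len(binary), len(binary[0])
    outside = [[False] * cols for _ in range(rows)]
    queue = deque()
    for r in range(rows):
        for c in range(cols):
            on_border = r in (0, rows - 1) or c in (0, cols - 1)
            if on_border and binary[r][c] == 0:
                outside[r][c] = True
                queue.append((r, c))
    while queue:  # flood fill the outer background (4-connected)
        y, x = queue.popleft()
        for dy, dx in NEIGHBORS_4:
            ny, nx = y + dy, x + dx
            if (0 <= ny < rows and 0 <= nx < cols
                    and binary[ny][nx] == 0 and not outside[ny][nx]):
                outside[ny][nx] = True
                queue.append((ny, nx))
    return [[0 if outside[r][c] else 1 for c in range(cols)]
            for r in range(rows)]


def label_blobs(binary):
    """Return the 8-connected blobs as lists of (row, col) coordinates."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    blobs = []
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] != 1 or seen[r][c]:
                continue
            seen[r][c] = True
            queue, blob = deque([(r, c)]), []
            while queue:
                y, x = queue.popleft()
                blob.append((y, x))
                for dy, dx in NEIGHBORS_8:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and binary[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            blobs.append(blob)
    return blobs


def blobs_to_image(blobs, rows, cols):
    """Draw the given blobs into a new binary image."""
    out = [[0] * cols for _ in range(rows)]
    for blob in blobs:
        for y, x in blob:
            out[y][x] = 1
    return out


def largest_blob(binary):
    """Keep only the blob with the largest area (pixel count)."""
    blobs = sorted(label_blobs(binary), key=len, reverse=True)
    return blobs_to_image(blobs[:1], len(binary), len(binary[0]))


def remove_small_blobs(binary, min_area):
    """Eliminate blobs whose area is less than min_area pixels."""
    kept = [b for b in label_blobs(binary) if len(b) >= min_area]
    return blobs_to_image(kept, len(binary), len(binary[0]))


def apply_mask(mask, image):
    """Element-wise product of a 0/1 mask and an image of the same size."""
    return [[m * p for m, p in zip(mask_row, image_row)]
            for mask_row, image_row in zip(mask, image)]
```

With these helpers, a call such as `apply_mask(largest_blob(fill_holes(bw)), image)` mirrors the order of steps 4 to 6, and `remove_small_blobs(bw, 1500)` corresponds to the empirical area limit of step 9.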


2.3. Performance evaluation 

Initial testing and validation were carried out on the formulated method to assess its performance and viability against the objective. The evaluation was carried out both qualitatively and quantitatively. In the qualitative analysis, a group of experts in the field (radiologists and image processing experts) evaluated the images produced by both methods, the proposed procedure for lung segmentation and the modified watershed segmentation, and indicated which method's results they preferred. The questions asked participants to identify the best image segmentation technique for lung segmentation. Minor modifications and improvements may be required at this stage, based on the recommendations from the radiologists, to improve the implementation of the method and justify the proposed technique.

The quantitative analysis is based on four statistical performance parameters, accuracy, precision, recall, and F-score, which are widely used in image segmentation studies [11], [13], [18], [24], [25]. Accuracy measures how well a diagnostic test identifies and rules out a specific condition. Precision is the percentage of the segmented region that corresponds to lung region in the ground truth image. Recall is the percentage of the actual lung region that was correctly segmented by the method. Precision and recall frequently conflict with each other. A high level of sensitivity, for example, can result in high recall but low precision, whereas a low level of sensitivity can result in high precision but low recall. As a result, achieving high precision and high recall at the same time is difficult. The F-score is the harmonic mean of precision and recall and therefore reflects an algorithm's performance more fully. A high F-score is obtained only when both precision and recall are high. The equations in Table 1 are used for this performance validation.

Table 1. Validation parameters

Parameter   Equation
Accuracy    (TP + TN) / (TP + TN + FP + FN)
Precision   TP / (TP + FP)
Recall      TP / (TP + FN)
F-score     (2 × Precision × Recall) / (Precision + Recall)
*TP: true positive, TN: true negative, FP: false positive, FN: false negative
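
The equations in Table 1 can be applied pixel by pixel to a segmentation mask and its ground truth mask. The sketch below is a minimal Python illustration. The function names `confusion_counts` and `segmentation_metrics` are hypothetical, both masks are assumed to be nested lists of 0/1 values of equal size, and returning 0.0 when a denominator is zero is a convention chosen for this sketch.

```python
def confusion_counts(predicted, truth):
    """Count TP, TN, FP and FN over two binary masks of equal size."""
    tp = tn = fp = fn = 0
    for pred_row, truth_row in zip(predicted, truth):
        for p, t in zip(pred_row, truth_row):
            if p == 1 and t == 1:
                tp += 1
            elif p == 0 and t == 0:
                tn += 1
            elif p == 1 and t == 0:
                fp += 1
            else:
                fn += 1
    return tp, tn, fp, fn


def segmentation_metrics(predicted, truth):
    """Return accuracy, precision, recall and F-score as fractions in [0, 1]."""
    tp, tn, fp, fn = confusion_counts(predicted, truth)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    # Guard against empty denominators (a convention of this sketch).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f_score = 2 * precision * recall / (precision + recall)
    else:
        f_score = 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_score": f_score,
    }
```

Multiplying each returned value by 100 gives percentages in the form reported for the two segmentation methods.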


3. RESULTS AND DISCUSSION 

This section presents the findings of the study and discusses the results. It is divided into three main parts. The first part presents the results and discussion on data collection. The second part presents the lung segmentation results of the proposed procedure and of the watershed segmentation from [26]. The last part describes the overall performance based on the quantitative and qualitative analyses. For the lung segmentation process, the qualitative assessment of the proposed method was benchmarked against ground truth images. The qualitative comparison of the proposed procedure and the modified watershed segmentation was based on visual inspection and scoring by 22 experts from several institutions: Hospital Sultanah Bahiyah (1), Hospital Serdang (1), Hospital Kulim (1), Hospital Tawau (1), Hospital Slim River (1), Advanced Medical and Dental Institute, Universiti Sains Malaysia (2), Universiti Teknologi Mara (4), Universiti Teknologi Malaysia (9), Universiti Kebangsaan Malaysia (1), and Universiti Tun Hussein Onn (1). For the quantitative evaluation, four statistical metrics were measured: accuracy, precision, recall, and F-score.


3.1. Data collection 

Data from 5 subjects with underlying lung cancer, collected in the retrospective study, was used to test the proposed method. Low image quality may affect research results [27], particularly the quantitative analysis. Therefore, only high-quality CT images were selected in this research. Figures 4(a) to 4(e) show examples of images obtained from the PACS at AMDI that were used in the evaluation.


Figure 4. Examples of images, (a) to (e), from the PACS at AMDI used in the evaluation


3.2. Qualitative analysis 

The proposed lung segmentation procedure was tested on 1,155 lung CT images. A sample outcome is shown in Figure 5: Figure 5(a) is the original lung image, Figure 5(b) is the ground truth image, Figure 5(c) is the result of modified watershed segmentation [21], and Figure 5(d) is the result of the proposed procedure. The main criterion used in this study to evaluate segmentation performance is the ability to outline the important region in the image and identify the boundaries of the lung from the surrounding thoracic tissue. The segmentation results in Figures 5(c) and 5(d) show that both methods produce very similar results (both can identify the boundaries of the lung from the surrounding thoracic tissue). However, the proposed procedure agrees better with the ground truth, particularly on the left side of the lung region, where the blood capillary (the undesired region) is not included in the lung region. This satisfies the criterion of the qualitative evaluation: the proposed procedure segments the critical area in the CT scan image better than the modified watershed segmentation.

Next, the results from both methods were sent to a group of experts (eight radiologists and fourteen image processing experts) for interpretation. A survey form was distributed, and the average selection rate was calculated as in Table 2. As shown in Table 2, the proposed procedure was selected more frequently by the 22 experts than the modified watershed segmentation. Based on the visual interpretation of the 22 experts who completed the questionnaire, an average of 93.5% of the experts selected the proposed procedure, while 6.3% selected the modified watershed segmentation. These results suggest that the proposed procedure can be used as a processing tool to segment the lung region.


Figure 5. Example for patient 1: (a) original lung image, (b) ground truth image, (c) modified watershed segmentation [26], and (d) proposed procedure for lung segmentation


Table 2. Qualitative results of two segmentation methods using visual interpretation

          Average selection (%)
Patient   Modified watershed segmentation [26]   Proposed procedure for lung segmentation
1         4.5                                    95.5
2         9.1                                    90.9
3         9.1                                    90.9
4         4.5                                    95.5
5         4.5                                    95.5
Average   6.3                                    93.5
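
The per-patient percentages in Table 2 are consistent with whole vote counts out of 22 experts. The short Python sketch below shows the arithmetic under that assumption. The function name `selection_share` is hypothetical, and the vote counts used in the usage note are inferred from the percentages, not reported in the survey.

```python
def selection_share(votes, n_experts=22):
    """Return the share of experts, in percent, rounded to one decimal place."""
    if not 0 <= votes <= n_experts:
        raise ValueError("votes must be between 0 and n_experts")
    return round(100 * votes / n_experts, 1)
```

Under this assumption, `selection_share(21)` and `selection_share(1)` correspond to a 21 to 1 split among the experts, and `selection_share(20)` and `selection_share(2)` to a 20 to 2 split.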


3.3. Quantitative analysis 

As mentioned in section 2.3, four statistical performance parameters, accuracy, precision, recall, and F-score, are used in the quantitative analysis to compare the segmentation methods. To show the effectiveness of the proposed method, comprehensive experiments were carried out in which the output of each segmentation method was evaluated against the ground truth image. The analyses were performed on CT images such as the one shown in Figure 5(a). The CT images were segmented using the modified watershed segmentation and the proposed procedure, and the results, as in Figures 5(c) and 5(d), were compared with the ground truth image, as in Figure 5(b), to identify the better method for lung segmentation. Table 3 shows the quantitative results for the proposed procedure and the modified watershed segmentation. The proposed procedure produced an average accuracy of 99.9%, precision of 99.9%, recall of 99.8%, and F-score of 99.74%. The modified watershed segmentation scored slightly lower, with an average accuracy of 99.58%, precision of 99.88%, recall of 96.24%, and F-score of 98.02%.

Finally, Table 4 summarizes the average performance of the proposed procedure for lung segmentation and the modified watershed segmentation method. The experimental results indicate that the proposed procedure performed slightly better than the watershed segmentation and gave the highest percentage for all the tested parameters. The table shows that the proposed procedure improved performance by 0.02% to 3.5% in the quantitative analysis. Therefore, this study indicates that the proposed procedure is the better of the two lung segmentation methods for lung cancer detection.


Table 3. Quantitative results for the performance of lung segmentation for the dataset in Figure 5

          Modified watershed segmentation [26]                      Proposed procedure for lung segmentation
Patient   Accuracy (%)  Precision (%)  Recall (%)  F-score (%)     Accuracy (%)  Precision (%)  Recall (%)  F-score (%)
1         99.5          99.8           96.6        98.2            99.9          99.9           99.8        99.7
2         99.8          99.9           99.3        99.6            99.9          99.9           99.9        99.9
3         99.3          99.9           92.3        95.9            99.9          99.9           99.8        99.8
4         99.7          99.9           97.7        98.8            99.9          99.9           99.9        99.9
5         99.6          99.9           95.3        97.6            99.9          99.9           98.9        99.4
Average   99.58         99.88          96.24       98.02           99.9          99.9           99.8        99.74


Table 4. The average performance of the lung segmentation technique

Average performance   Modified watershed        Proposed procedure for      Discrepancy (%)
evaluation            segmentation [26] (%)     lung segmentation (%)
Accuracy              99.58                     99.90                       0.20
Precision             99.88                     99.88                       0.02
Recall                96.24                     99.60                       3.50
F-score               98.02                     99.74                       1.70


4. CONCLUSION 

The current study aimed to develop a new image processing procedure for lung segmentation. The procedure was developed to help radiologists segment the lung region for lung cancer detection. As a contribution to biomedical engineering, a new procedure that can detect the lung region was successfully developed using the method described above. The study conducted experimental evaluations on subjects with underlying lung cancer whose data was collected in the retrospective study. Both qualitative and quantitative evaluations were used to assess the experimental results. The procedure improves on the previous study in the segmentation of the lung region. Using the proposed procedure, performance increased by 0.02% to 3.5% in the quantitative analysis. The results of the two segmentation methods were similar, but the proposed procedure performed better than the watershed segmentation method. In the qualitative analysis, the proposed procedure was selected more frequently by the 22 experts than the modified watershed segmentation. Therefore, the proposed procedure for lung segmentation gives the better results of the two methods. The findings reported here shed new light on detecting lung cancer. Further research should explore feature extraction of lesions in the lung for lung cancer detection.